52 research outputs found

    Dwarfs on Accelerators: Enhancing OpenCL Benchmarking for Heterogeneous Computing Architectures

    For reasons of both performance and energy efficiency, high-performance computing (HPC) hardware is becoming increasingly heterogeneous. The OpenCL framework supports portable programming across a wide range of computing devices and is gaining influence in programming next-generation accelerators. Characterizing the performance of these devices across a range of applications requires a diverse, portable and configurable benchmark suite, and OpenCL is an attractive programming model for this purpose. We present an extended and enhanced version of the OpenDwarfs OpenCL benchmark suite, with a strong focus placed on the robustness of applications, curation of additional benchmarks with an increased emphasis on correctness of results and choice of problem size. Preliminary results and analysis are reported for eight benchmark codes on a diverse set of architectures: three Intel CPUs, five Nvidia GPUs, six AMD GPUs and a Xeon Phi. Comment: 10 pages, 5 figures

    Retuning of Inferior Colliculus Neurons Following Spiral Ganglion Lesions: A Single-Neuron Model of Converging Inputs

    Lesions of spiral ganglion cells, representing a restricted sector of the auditory nerve array, produce immediate changes in the frequency tuning of inferior colliculus (IC) neurons. There is a loss of excitation at the lesion frequencies, yet responses to adjacent frequencies remain intact and new regions of activity appear. This leads to immediate changes in tuning and in tonotopic progression. Similar effects are seen after different methods of peripheral damage and in auditory neurons in other nuclei. The mechanisms that underlie these postlesion changes are unknown, but the acute effects seen in IC strongly suggest the “unmasking” of latent inputs by the removal of inhibition. In this study, we explore computational models of single neurons with a convergence of excitatory and inhibitory inputs from a range of characteristic frequencies (CFs), which can simulate the narrow prelesion tuning of IC neurons, and account for the changes in CF tuning after a lesion. The models can reproduce the data if inputs are aligned relative to one another in a precise order along the dendrites of model IC neurons. Frequency tuning in these neurons approximates that seen physiologically. Removal of inputs representing a narrow range of frequencies leads to unmasking of previously subthreshold excitatory inputs, which causes changes in CF. Conversely, if all of the inputs converge at the same point on the cell body, receptive fields are broad and unmasking rarely results in CF changes. However, if the inhibition is tonic with no stimulus-driven component, then unmasking can still produce changes in CF.

    An Estimate of Avian Mortality at Communication Towers in the United States and Canada

    Avian mortality at communication towers in the continental United States and Canada is an issue of pressing conservation concern. Previous estimates of this mortality have been based on limited data and have not included Canada. We compiled a database of communication towers in the continental United States and Canada and estimated avian mortality by tower with a regression relating avian mortality to tower height. This equation was derived from 38 tower studies for which mortality data were available and corrected for sampling effort, search efficiency, and scavenging where appropriate. Although most studies document mortality at guyed towers with steady-burning lights, we accounted for lower mortality at towers without guy wires or steady-burning lights by adjusting estimates based on published studies. The resulting estimate of mortality at towers is 6.8 million birds per year in the United States and Canada. Bootstrapped subsampling indicated that the regression was robust to the choice of studies included, and a comparison of multiple regression models showed that incorporating sampling, scavenging, and search efficiency adjustments improved model fit. Estimating total avian mortality is only a first step in developing an assessment of the biological significance of mortality at communication towers for individual species or groups of species. Nevertheless, our estimate can be used to evaluate this source of mortality, develop subsequent per-species mortality estimates, and motivate policy action.

    A novel receptor-type protein tyrosine phosphatase is expressed during neurogenesis in the olfactory neuroepithelium

    Tyrosine phosphorylation plays a central role in the control of neuronal cell development and function. Yet, few neuronal protein tyrosine phosphatases (PTPs) have been identified. We examined rat olfactory neuroepithelium for expression of novel PTPs potentially important in neuronal development and regeneration. Using the polymerase chain reaction with degenerate DNA oligomers directed to the conserved tyrosine phosphatase domain, we identified 6 novel tyrosine phosphatases. One of these, PTP NE-3, is a receptor-type PTP expressed selectively in both rat brain and olfactory neuroepithelium. In the olfactory neuroepithelium, PTP NE-3 expression is restricted to neurons and describes a novel pattern of expression with a high level in the immature neurons and a lower level in mature olfactory sensory neurons. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/30663/1/0000306.pd

    Guidelines for the use and interpretation of assays for monitoring autophagy (4th edition)

    In 2008, we published the first set of guidelines for standardizing research in autophagy. Since then, this topic has received increasing attention, and many scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Thus, it is important to formulate on a regular basis updated guidelines for monitoring autophagy in different organisms. Despite numerous reviews, there continues to be confusion regarding acceptable methods to evaluate autophagy, especially in multicellular eukaryotes. Here, we present a set of guidelines for investigators to select and interpret methods to examine autophagy and related processes, and for reviewers to provide realistic and reasonable critiques of reports that are focused on these processes. These guidelines are not meant to be a dogmatic set of rules, because the appropriateness of any assay largely depends on the question being asked and the system being used. Moreover, no individual assay is perfect for every situation, calling for the use of multiple techniques to properly monitor autophagy in each experimental setting. Finally, several core components of the autophagy machinery have been implicated in distinct autophagic processes (canonical and noncanonical autophagy), implying that genetic approaches to block autophagy should rely on targeting two or more autophagy-related genes that ideally participate in distinct steps of the pathway. Along similar lines, because multiple proteins involved in autophagy also regulate other cellular pathways including apoptosis, not all of them can be used as a specific marker for bona fide autophagic responses. Here, we critically discuss current methods of assessing autophagy and the information they can, or cannot, provide. Our ultimate goal is to encourage intellectual and technical innovation in the field.

    Characterizing and Predicting Scientific Workloads for Heterogeneous Computing Systems

    The next generation of supercomputers will feature a diverse mix of accelerator devices. The increase in heterogeneity is explained by the nature of these devices: certain accelerators offer acceleration, or a shorter time to completion, for particular programs. Characteristics of these programs are fixed regardless of which accelerator is used for computation; for instance, a graph traversal program always exhibits the properties of graph traversal regardless of the device on which it is executed. This work presents a methodology to collect these characteristics and use them to inform the selection of the optimal accelerator device. On HPC systems a single node may feature a GPU, CPU, and an FPGA or MIC. The focus of this work is to schedule scientific codes to the most suitable device on a node, which offers a more efficient system with shorter execution times and less energy expenditure. OpenCL is an attractive programming model for high-performance computing systems: with wide support from hardware vendors, it is a highly portable language in which a single implementation can execute on CPU, GPU, MIC and FPGA alike. To support efficient scheduling on HPC systems it is necessary to perform accurate performance predictions for OpenCL workloads on varied compute devices, which is challenging due to diverse computation, communication and memory access characteristics which result in varying performance between devices. The first focus of this work is to present a comprehensive benchmark suite for OpenCL in the heterogeneous HPC setting: an extended and enhanced version of the OpenDwarfs OpenCL benchmark suite. My extensions improve portability and robustness of applications, correctness of results and choice of problem size, and increase diversity through coverage of additional application patterns. This work manifests in performance measurements on a set of 15 devices and over 11 applications.
    We next present the Architecture Independent Workload Characterization (AIWC) tool, which characterizes OpenCL kernels according to a set of architecture-independent features. Features are measured by counting target characteristics which are collected during program execution in a simulator. They are presented as 42 metrics that indicate performance bottlenecks in four categories: parallelism (how well an algorithm scales in response to core count), compute (such as the diversity of instructions), memory (working memory footprint and entropy measurements which correspond to caching characteristics) and control (such as branching and program flow). The metrics collected are primarily used in the prediction of execution times, but since they are representative of structural characteristics of the underlying program and are free from architectural traits, they can also be used for diversity analysis of benchmark suites, for identifying program requirements (which allows the automatic calculation of theoretical peak performance for a given device), and for examining phase-transitional properties of application codes. This work also discusses the design decisions made to collect AIWC features. Finally, this work culminates in a methodology which uses AIWC features to form a model capable of predicting accelerator execution times. I use this methodology to predict execution times for a set of 37 computational kernels running on 15 different devices representing a broad range of CPU, GPU and MIC architectures. The predictions are highly accurate, differing from the measured experimental run-times by an average of only 1.2%. A previously unencountered code can be instrumented once and the AIWC metrics embedded in the kernel, to allow performance prediction across the full range of modeled devices.
    The results suggest that this methodology supports correct selection of the most appropriate device for a previously unencountered code, and is highly relevant to efficiently scheduling codes to the emerging supercomputing systems where nodes are becoming increasingly heterogeneous.
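
    The abstract above mentions entropy measurements over memory accesses as an architecture-independent indicator of caching behaviour, but does not say how AIWC computes them. As a rough illustration only, the sketch below computes the Shannon entropy of a list of accessed addresses. The function name `address_entropy` and the two toy traces are hypothetical and are not taken from the AIWC tool.

```python
import math
from collections import Counter


def address_entropy(addresses):
    """Shannon entropy, in bits, of a sequence of accessed memory addresses.

    Assumption: each element is one access to one address. This is an
    illustrative metric, not AIWC's actual implementation.
    """
    counts = Counter(addresses)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total          # empirical probability of this address
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy traces (not measured data).
streaming = list(range(8))   # 8 distinct addresses, each touched once
hot_spot = [0] * 8           # one address touched 8 times

print(address_entropy(streaming))  # uniform over 8 addresses: 3 bits
print(address_entropy(hot_spot))   # a single address: 0 bits
```

    Under this reading, a low value means accesses concentrate on a few addresses, while a high value means they are spread widely. Because the value depends only on the access trace, it is independent of any particular device.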